In silico analyzing the molecular interactions of plant-derived inhibitors against E6AP, p53, and c-Myc binding sites of HPV type 16 E6 oncoprotein

Human papillomaviruses (HPV) are a group of strong human carcinogen viruses considered to be the fourth leading cause of mortality among women in the world. HPV is the most important cause of cervical cancer, which is the second most common cancer in women living in low and middle-income countries. To date, there is no effective cure for an ongoing HPV infection; therefore, it is required to investigate anticancer drugs against this life-threatening infection. In this study, we collected more than 100 plant-derived compounds with anti-cancer and antiviral potentials from a variety of papers. Smile formats of these compounds (ligand), were harvested from PubChem database and examined based on the absorption, distribution, metabolism, excretion, and toxicity properties by programs such as Swiss ADME, admetSAR, and pkCSM. Twenty compounds, which were likely to be the HPV16E6 inhibitor, were selected for docking calculations. We examined these natural inhibitors against the HPV16 E6 oncogenic protein. Eventually, three of these compounds were used as the most potent inhibitors (Ginkgetin (peculiarly), Hypericin and Apigetrin) were probably used as the possible source of cancer treatment caused by E6 oncoprotein. In this research, we conducted the docking calculations by Autodock 4.2.6 software. Docking analysis showed the interaction of these plant-originated inhibitors with E6AP, p53, and Myc binding sites on the E6 oncoprotein which support the normal function of E6AP, p53, and Myc.


INTRODUCTION
Over 100 types of cancers affect humans, which is the second-leading cause of mortality worldwide and responsible for 8.8 million deaths in 2015. Expected by 2020, this cancer causes up to 10 million deaths (World Health Organization report) [1]. Some researchers believe that most cancers (about 90-95%) are due to genetic mutations caused by environmental factors and lifestyle, and the remaining 5-10% are due to inherited genetics [2,3]. Worldwide, approximately 18 percent of mortality from cancer are related to infectious agents [4]. Inhibiting the activity of major transcription factors has shown to be a high potential approach for cancer treatment [5,6].
Papillomaviruses are a group of the most commonly found viruses that could cause cancer in humans. It has been reported that some strains of human papillomavirus (HPV) cause cervical cancer [7]. Over 200 different HPV types have been identified to infect human and are classified into two different groups: high risk and low risk [8]. High-risk genotypes, including HPV16, 18, 45 and 33 (63,11,6, and 4%, respectively) are more commonly associated with cervical cancers, whereas low-risk types such as HPV6 and 11, typically cause genital warts [9,10].
High-risk types of HPV are the causative agents of over 99.7% of cervical cancers [11]. Human papillomavirus type 16 is known as a major causative factor in the development of cervical carcinomas [12]. Estimates of the global cancer burden indicate that cervical cancer is increasingly prevalent in low-and middle-income countries [13][14][15][16][17].
HPVs are a group of small non-enveloped, icosahedral tumor viruses that possess a circular double-stranded DNA genome with a size of 55 nm in diameter that infects cutaneous or mucosal epithelial cells, causing papillomata or warts on the skin, genital tissues, and the upper respiratory tract [18]. The HPV genome consists of approximately 8000 base pairs [19]. Functionally, the genome of the HPV is divided into three distinct regions. The first long control region (LCR or URR (upstream regulatory region)) is responsible for the regulation and control of viral DNA replication and transcription. The early region contains six ORFs (open reading frames) and encoding non-structural viral regulatory proteins and viral replication (E1, E2, E4), three of which, E5, E6, and E7, are oncogenic. The late genes region encodes structural proteins involved in the formation of the viral capsid proteins L1 and L2, and varies between different HPV types, which is necessary for virion transmission and spread [20,21].
The E6 is one of the two oncoproteins expressed in the oncogenic human papillomavirus types 16 and 18, which plays a major role in malignant transformation, carcinogenicity, and immortality [22][23][24]. The E6 is an oncoprotein composed of polypeptides made up of 150 amino acids long with a molecular weight between 16-18 kDa and two Cys-X-X-Cys motifs that allow the formation of two zinc fingers [25,26]. E6 oncoproteins are expressed as a full-length (16-18 kDa) protein as well as its splice isoform E6* (7 kDa), in the host cells [27]. It has been shown that the expression of the HPV16 E6* isoform increases oxidative stress and induces oxidative DNA damage in the host cells [28,29].
P53 is the most important human protein, which is involved in cell cycle regulation, apoptosis, as well as playing a significant role in maintaining genetic stability conditions. When the human cell's genome is damaged for any cause, and these damaged cells do not enter the cell proliferation cycle, the rate of p53 increases rapidly and induces apoptosis of the damaged cells [21]. High-risk E6 oncoproteins mediate the ubiquitination and cellular ubiquitin ligase protein called E6AP (E6-Associated Protein) that catalyzes the degradation of p53 through the formation of a heterotrimeric complex (E6/E6AP/p53). Direct binding between E6, p53, and E6AP forms this complex. [33]. When the content of p53 decreases, after degradation by E6 oncoprotein of the host cells, they cannot ordinarily repair DNA damages, which ultimately leads to malignancy.
Myc (transcription factor) is one of the E6 partners that regulates the expression of up to 10-15% of the cellular genes controlling metabolic processes, post-translational modifications, macromolecular synthesis, cellular proliferation, and apoptosis [34]. hTERT (human telomerase reverse transcriptase ), the catalytic subunit of telomerase, is one of the Myc targets [35]. Telomerase is a complex of ribonucleoprotein enzymes, consisting of two core subunits, called template RNA subunit (human telomerase RNA (hTR)) and a catalytic protein subunit (hTERT), that extends telomeric DNA. The telomere DNA is a repetitive and nucleoprotein complex located at the ends of eukaryotic chromosomes with approximately 5,000 to 15,000 nucleotides length in humans. Without the telomerase activity, the linear chromosomes of cellular DNA serially are shortened with each cell cycle and get divided by 100 to 200 nucleotides, then cells enter the mortality stage [35,36]. New studies have shown that cells infected with E6 protein hr-HPV use c-Myc and directly interact with hTERT, leading to increased phosphorylation of RNA Pol II and induced epigenetic histone modifications of the hTERT promoter, therefore it triggers the activation of the hTERT promoter, which increases the telomerase activity [37]. Increased expression of hTERT in the most human cancers, including HPV-positive cervical cancer, prevents the shortening of telomere length, leading to cell immortalization.
In this study, twenty different natural compounds from several plant sources for molecular docking calculations were collected (Supplement file, Table S1), and we investigated the molecular interaction of the HPV16 E6 protein with plant-originated inhibitors around E6AP, p53, and Myc binding site respectively.
The compounds were not supposed to inhibit cytochromes (CYP3A4, CYP2D6, CYP2C9, CYP2C19, and CYP1A2) and have high water solubility )Log s<6), Log p be in the range of -0.7 < log p < 5, and TPSA be ranging 20 < TPSA < 130 Å². After the filtering operation by online servers, 20 of the 100 natural compounds were selected for molecular docking calculations (Table S1). Chemical structures of twenty natural compounds were retrieved from the PubChem database. The structures were downloaded in smiles format and converted to PDB using Chimera 1.12 then PDB format's structures were minimized by chimera 1.12 and were saved in PDB files. PDB file was submitted to ADT (AutoDock Tools is a set of commands implemented within the Python Molecular Viewer (PMV)) for PDBQT file preparation. Also, docking calculations were carried out on ligands by ADT (1.5.6), AutoDock4.2 program operated Docking calculation. Three-dimensional structure of E6 (PDB ID: 4GIZ) was retrieved from the PDB database. Initially, extra chains (A, B, and D), water molecules, and ligands of PDB file (4GIZ) retrieved from Protein Data Bank were removed by MVD. Finally, the C chain (4GIZ) was submitted to the MVD for optimizing/repair, then the PDB file was evaluated and passed to Autodock Tools (ADT ver.1.5.6) for PDBQT File preparation. Therefore, non-standard residues were removed. Only polar hydrogens were maintained, and Gasteiger charges were computed for protein atoms, Kollman charges were also assigned by ADT, The PDB file was prepared for Docking with Autodock 4.2.6. Initially, for the studies of molecular Docking simulations E6 with Myc, we predicted Myc 3D protein structure by online servers. Since the three-dimensional structure (3D) of Myc retrieved from the protein data bank, (ID: 1NKP) had a problem, owing to four amino acids (30, 31, 32, and 33) to be in the error region.
The Myc Ref.sequence (NP_002458.2) was harvested from NCBI. For the modeling of the 3D structure of the Myc protein, Phyre 2 server was used, and the result of the Phyre 2 server query was as following: coverage of the target-template alignment was 99.61% confidence with 98% identity. Structural refinement and energy minimization of the predicted model was carried out using YASARA Energy Minimization program. The total energy minimization for the refined structure obtained from the YASARA program was -66671.8 kJ/mol (score, 2.40), whereas, before energy minimization, it was -50950 kJ/mol (score, 0.52). The refined model reliability was assessed through Procheck, ProSA-Web, ProQ and, Verify-3D. Finally, Myc 3D structure was refined for further confirmation by the ERRAT online server. The Ramachandran plot for the Myc 3D structure was created by Phyre 2 server was as following: the 97.6% residues were in the favorable regions, 2.4% residues were in the allowed region, and no amino acids were found in the outlier region (Fig. 1). Determining the binding sites of E6 to E6AP, p53, and Myc: We explained the atomic interaction between plant-originated ligands and high-risk HPV16 E6 oncogenic protein. These natural plant compounds might prevent and disable E6 oncoprotein from interacting with E6AP, p53, and Myc proteins. However, when E6 interacts with host E6AP, p53 and Myc protein, it plays important roles in the induction of cell immortality All residues were involved in the binding of E6 to E6AP consist of R10, K11, V31, Y32, L50, C51, V53, R55, V62, L67, Y70, S74, R77, H78, L100, R102, Q107, R129, and R131. Residue interface in hydrophobic interactions such as: V31, Y32, L50, C51, V53, V62, L67, Y70, L100, R102, Q107, R131 and residue interface in polar interactions involve side-chain groups such as R10, R55, S74, R77, H78, R129, R131 and residue interface in polar interactions involve backbone groups (main chain) such as K11, C51, R102, and R131 [54]. Autodock calculated the region of interest for docking runs; the grid map was formed based on polar residues in the binding site of receptor (E6 oncoprotein) proteins to E6AP for the docking process. The dimension of the grid map was calculated based on polar residue in the binding site using AutoGrid (part of the AutoDock package) and drawn. The dimension of the grid map was given as x-dimension = 56, number of points in y-dimension = 50 and number of points in z-dimension = 86 points in a grid space of 0.375 Å, centers of grid box: X = 2.851; Y = 50.522; Z = 27.126. All residues were involved in the binding of E6 to p53 including Q6, E7, R8, R10, Q14, E18, Y43, D44, F47, D49, L100, and P112. Residue interface in hydrophobic or non-polar interactions were including Q6, E7, R8, R10, Q14, E18, Y43, D44, F47, D49, L100, and P112. Residue interface in polar interactions involve side-chain groups were including E7, R10, Q14, E18, D44, D49 and residue interface in polar interactions involve backbone groups (main chain) were including Q6, E7, R8, R10, and E18 [55].
AutoGrid was used to create a grid map; the grid map was formed based on polar residues in the binding site of receptor (E6 oncoprotein) proteins to p53 for docking process.  Table 1.
The PDB files (receptor and ligands) prepared for docking operation were submitted to ADT for PDBQT file preparation. Docking calculations were carried out on receptors such as delete water molecules, non-standard residues were removed, only polar hydrogen was maintained, and Gasteiger charges were computed for protein atoms, Kollman charges were also assigned. As well as Docking, calculations on ligand such as computing Gasteiger charges, merged non-polar hydrogens, found aromatic carbons, detected rotatable bonds and set TORSDOF carried out by ADT automatically. After docking calculations, PDBQT files were saved for docking computations with the AutoDock4.2 program. The grid maps representing the receptor proteins in the docking process were calculated (x, y, and z dimension) using AutoGrid (see above). The Lamarckian genetic algorithm (LGA) was applied to the interaction pattern between the receptor and selected natural metabolite inhibitors (ligand), with 100 genetic algorithms (GA) runs. Other settings for the Docking process was considered a default, such as the population size of 150 was determined for each 25 × 105 energy evaluations, 27,000 maximum number of generations, 0.02 rate of gene mutation, and a crossover rate of 0.8 were used for the LGA.

RESULTS
Docking analysis showed that all 20 natural ligands bind to three binding sites on HPV E6 oncoproteins that can help the restoration of the normal functioning of tumor suppressor proteins, and the lowest binding energy conformation was analyzed and presented in Table 1. It was showed that interaction of all 20 natural ligands with HPV E6 oncoproteins, and among them, Ginkgetin (GK) (especially), Hypericin, and Apigetrin have effectively inhibited three binding sites on HPV E6 oncoproteins with minimum binding energy.
Among the three natural ligands (GK, Hypericin, and Apigetrin) selected, GK most effectively with the lowest binding energy interacted with all three binding sites E6AP, p53, and myc on E6 oncoproteins. (The three-dimensional structure of these interactions are shown in Figure 2 (A, B, and C respectively). GK showed the lowest binding energy (-8.45 kcal/mol) with the E6AP binding site on HPV-16 E6 protein and inhibition constant (0.642 μM) for the protein-ligand complex. The four amino acid residues of HPV-16 E6, (i.e., Arg 55, Cys 51, Val 53, and Tyr 60) were observed to form five hydrogen bonds with GK during protein, ligand interactions (Fig. 3A). Similarly, the binding energy of GK with the p53 binding site on HPV-16 E6 protein was showed to be a minimum binding energy of -8.46 kcal/mol with an inhibition constant of 0.632 μM. GK formed five hydrogen bonds with four amino acid residues (i.e.; Arg 55, Cys 51, Val 53, and Tyr 60) from the HPV-16E6 protein (Fig. 3B). In the case of the binding energy of GK with Myc binding site on HPV-16, E6 protein interacted with two amino acid residues from the receptor (i.e., Pro 5, and Arg 8) by forming four hydrogen bonds (Fig. 3C), the binding energy of the interaction was -7.22 kcal/mol, and the inhibition constant was 5.11 μM ( Table 1).
The study showed that GK is an effective inhibitor of the binding sites available on the E6. This computational approach demonstrates the effectiveness of GK as an anticancer agent that needs to be explored further until we learn more about how to use GK or others natural products for designing novel drugs against cervical cancer.  [1]. Carcinogenic viruses, especially oncogenes produced by them with several different molecular mechanisms modify normal cells to cancer cells. Viral oncoproteins like E6, form complexes with cellular proteins, causes broad biological changes including; regulating cellular pathways, preventing shortening of the telomeres, immortalization, host cell differentiation, regulating growth factors, degradation and inactivation tumor suppressor, interference with DNA repair efficiency, and apoptosis, promote cell transformation, increase of the host genetic and epigenetic alterations etc. In addition, carcinogenic viruses cause genetic disorders by entering their genome into host cell chromosomes [30,56,57]. Although several viruses can cause various tumors in animals, only seven kinds of them are linked with cancer in humans [1,58].
Cervical cancer is one of the most dangerous and deadly cancers in women caused by HPV. There are several options for the treatment of early-stage cervical cancer such as surgery, nonspecific chemotherapy, radiation therapy, laser therapy, hormonal therapy, targeted therapy, and immunotherapy, but there is no effective cure for an ongoing HPV infection. Herbal extracts are one of the therapeutic areas for cervical cancer; many researchers have studied the effect of plant metabolites on cervical cancer treatment. Researchers have demonstrated that curcumin, epigallocatechin-3-gallate (EGCG), jaceosidin, resveratrol, indole-3-carbinol, withaferin A, artemisinin, ursolic acid, ferulic acid, berberin, resveratrol, gingerol, and silymarin compounds are possible effective agents for cancer treatment [59,60]. Traditionally, different plant-originated compounds have been identified and tested as promising resources against cancer caused by HPV.
Advancements in computational biology and bioinformatics are useful to the investigation of novel inhibitors from herbal medicines against cancers like cervical cancer. Computer-aided docking is a valuable tool for gaining an understanding of the binding interactions between a ligand (small molecule) and its receptor (macromolecule) and has emerged as a reliable, costeffective, time-saving and fast technique for the discovery of novel drugs [61]. Molecular docking provides the following three main goals: predicting the ligand binding site to the receptor, virtual screening, and binding affinity calculation [62]. Molecular docking studies further help to understand the various interactions between the small molecules and particular receptor targets binding sites and is used as a standard computational tool to design novel potent inhibitors.
The high-risk HPV type 16 has oncoprotein (E6), that need to stay blocked for various reasons, including the fact that E6 is able to be inactivated tumor suppressor proteins (p53) by the E6AP human protein, then causes p53 degradation by the proteasome pathway. In addition, E6/ Myc interactions cause to induce the transactivation hTERT promoter and the increase of telomerase activity, finally leading to tumor cells immortalization. Therefore, E6 oncoprotein is a very important goal in designing new inhibitors against cervical cancer.
GK is a natural bioflavonoid isolated from leaves of Ginkgo biloba (Ginkgoaceae). Ginkgo biloba (GB) is originated in Asia, particularly in southeast China, where more than 4000 years it has been used as traditional herbal medicine. Extracts from GB leaves contain various glycosides and terpenoids and have been found to exert different pharmacological actions. Among several compounds, isolated from GB leaves, GK has shown various biological functions including the potent anti-inflammatory and anti-viral, antifungal, neuroprotective, anti-influenza, anti-arthritic and antitumor activities [63]. Researchers have reported that GK inhibits the growth of prostate and breast cancer cells [64,65], and inhibits the proliferation of human leukemia cells [66]. Also, GK induces autophagy and apoptosis in cancer cells [67,68]. In the present study, bioinformatics simulation showed the anticancer properties of GK against cervical cancer. Generally, GK by blocking the binding site of E6 oncoprotein to E6AP, p53, and Myc, will support the normal function of these intracellular proteins.